Precise Regression Benchmarking with Random Effects: Improving Mono Benchmark Results
Abstract
Benchmarking as a method of assessing software performance is known to suffer from random fluctuations that distort the observed performance. In this paper, we focus on the fluctuations caused by compilation. We show that the design of a benchmarking experiment must reflect the existence of the fluctuations if the performance observed during the experiment is to be representative of reality. We present a new statistical model of a benchmark experiment that reflects the presence of the fluctuations in compilation, execution and measurement. The model describes the observed performance and makes it possible to calculate the optimum dimensions of the experiment that yield the best precision within a given amount of time. Using a variety of benchmarks, we evaluate the model within the context of regression benchmarking. We show that the model significantly decreases the number of erroneously detected performance changes in regression benchmarking.
Similar resources
Mono Regression Benchmarking
Regression benchmarking is a methodology for detecting performance changes in software by periodic benchmarking. Detecting performance regressions in particular helps to improve software quality, much as regression testing does, although regression testing focuses only on software functionality. To achieve an acceptable level of false alarms, regression benchmarking requires a statistically sound planning an...
An Efficiency Measurement and Benchmarking Model Based on Tobit Regression, GANN-DEA and PSOGA
The purpose of this study is to design a model based on Tobit regression, DEA, Artificial Neural Network, Genetic Algorithm and Particle Swarm Optimization to evaluate efficiency and to benchmark the efficient and inefficient units. This model has three stages, and it uses the data envelopment analysis combined model with neural network, optimized by genetic algorithm, to evaluate the ...
Quality Assurance in Performance: Evaluating Mono Benchmark Results
Performance is an important aspect of software quality. To prevent performance degradation during software development, performance can be monitored and software modifications that damage performance can be reverted or optimized. Regression benchmarking provides means for an automated monitoring of performance, yielding a list of software modifications potentially associated with performance ch...
Random forest versus logistic regression: a large-scale benchmark experiment
The Random Forest (RF) algorithm for regression and classification has gained considerable popularity since its introduction in 2001. Meanwhile, it has grown into a standard classification approach competing with logistic regression in many innovation-friendly scientific fields. In this context, we present a large scale benchmarking experiment based on 260 real datasets comparing the prediction p...
Practical benchmarking in DEA using artificial DMUs
Data envelopment analysis (DEA) is one of the most efficient tools for efficiency measurement which can be employed as a benchmarking method with multiple inputs and outputs. However, DEA does not provide any suggestions for improving efficient units, nor does it provide any benchmark or reference point for these efficient units. Impracticability of these benchmarks under environmental conditio...
Publication date: 2006